Roblox Forum Archive Beta

(Public mirrors: https://www.reddit.com/r/roblox/comments/7xg6ps/roblox_forum_archive_beta/, Roblox Forum Archive Beta)

I archived the entire Roblox forum last month and it is now available for public access at https://archive.froast.io. The source of this archive is available. Read below for more info.

First of all I’d like to thank the contributors who made this project possible. This was not cheap at all in terms of human and technical resources. I really appreciate donations and contributions from all of the following (and more).

  • Metsfan2009
  • Ravenshield
  • Jaxon (anonymous - did not make contact with me after donating)
  • CPTKurkIV
  • FractalGeometry
  • Happywalker
  • Growncool7
  • sanjay2003

Continued donations will allow me to keep the site ad-free and possibly add more features. You can donate on the site.

I did this for the community and will not keep this archive for myself to abuse. You will not be charged for viewing any posts or making any searches; however, please do not run bots on my website. If you want to get the archive for yourself, you can download it without slowing down the website for everyone else (see the Source section). I would even like to make the website code open source when I have the chance to.

The archive is currently in beta but the beta is mature: it is in beta because I have not had the time to add many features that I would like to. A list of currently available features and planned features is below.

Introduction

I was disappointed when Roblox announced that the forums were closing and that they weren't leaving a read-only copy of it anywhere on the internet. A lot of the forum deserved to be deleted but much of it is real history of real people that deserves to be preserved. Even though I learned about the closure late and had very little time, I quickly and sloppily wrote a program to archive the forum.

I succeeded in archiving the entire forum and was left with tens of millions of little files containing hundreds of millions of posts. The reason I am releasing this archive over a month after the forum shut down (and the archive itself was complete) is because I’ve been working on a website to actually make the archive accessible (and school started after December which meant I didn’t have many chances to work on it).

I did not archive the forum just for the sake of archiving it. Although this is part of the reason, a big reason is allowing people easy access to any posts they may want to see or need from the past. Another archive was supposedly made by ArchiveTeam but this archive remains difficult to access, does not preserve the structure of the whole forum, and has no viable way of being searched. My archive actually fulfilled another project I had thought of before but never tried: an actual forum search.

Roblox’s searching by keyword didn’t work and searching by user barely worked, meaning posts that were not tracked or in a user’s recently posted list were essentially lost if the ID had not been recorded. The big feature I have added is the ability to view any single user’s posts. You can go back in time to see the first post you made and jump to anywhere in between. I would like to add searching by keyword as well but have not had much time and I do not want to delay the release of the archive any longer.

Source

All files, including raw files, are stored in cheap archive storage indefinitely. The AWS Glacier vault, however, is difficult to access and will not be made public for the time being. Right now, the files are also stored on AWS EBS drives. These are much easier to access, but the storage costs are very high. Funds are limited: if you plan on working with the raw files, I highly recommend downloading them now. Once the SQL dump of the archive is released, it will remain available for longer.

Even if you don’t need the files, I encourage you to download the source files of the archive and share them with others.

The following data are not available in this archive, but they are the most trivial pieces of forum posts:

  • How many views the post has
  • Whether the post was marked “popular”
  • Whether the post was locked or not
  • Whether the post was pinned or not

Raw

Every _thread_, including all its pages, was saved to its own file in the format `postId-page.gz` (for example, `20128158-1.gz`). This was done on 10 servers for 10 post ranges onto 10 different hard disks, all of which are on AWS.
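
As a rough illustration of the naming scheme above, the sketch below splits a name such as `20128158-1.gz` into its post ID and page number, and reads one gzip file as text. The function names are hypothetical, and treating the decompressed content as UTF-8 text is an assumption; the post does not say what each file looks like once decompressed.

```python
import gzip
import re
from pathlib import Path
from typing import NamedTuple

# Matches names like "20128158-1.gz": <postId>-<page>.gz
_NAME_RE = re.compile(r"^(\d+)-(\d+)\.gz$")


class ArchivedPage(NamedTuple):
    post_id: int
    page: int


def parse_archive_name(path: str) -> ArchivedPage:
    """Split a `postId-page.gz` file name into its two numbers."""
    name = Path(path).name
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a postId-page.gz name: {name!r}")
    return ArchivedPage(int(match.group(1)), int(match.group(2)))


def read_archived_page(path: str) -> str:
    """Decompress one archived page; UTF-8 text is an assumption."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return handle.read()
```

For the example name from the post, `parse_archive_name("20128158-1.gz")` gives post ID 20128158 and page 1.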

Below are the snapshot IDs for each drive. Each drive is 100GB, with about 50GB used. Each contains a range of 23 million posts: the first has posts 23 million through 46 million, the second has posts 46 million through 69 million, and so on.

  1. snap-08acac73d5c493f0e
  2. snap-01ab1ecbbf927da8d
  3. snap-020e0bac39caed5c5
  4. snap-058d5c6fa4586f253
  5. snap-0220d0241806fb57c
  6. snap-005056bd489ad8f9a
  7. snap-0fdf1dcaca43114cc
  8. snap-02a98683b2874d0ff
  9. snap-064920a7e4cd5c612
  10. snap-0e8b4974f0f4bcbb1
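
Given ranges of 23 million posts per drive, a post ID can be mapped to a drive number with integer division. This is only a sketch under assumptions: the function and constant names are made up, and the ranges are treated as half-open because the post does not say which drive owns a boundary ID. The post gives 23 million as the start of the first drive, but the example ID 20128158 falls below that, so the starting offset is left as a parameter rather than hard-coded.

```python
POSTS_PER_DRIVE = 23_000_000  # range size stated in the post
DRIVE_COUNT = 10              # ten drives, one snapshot each


def drive_for_post(post_id: int, first_start: int) -> int:
    """Return the 1-based drive number whose range would contain post_id.

    Assumes half-open ranges [start, start + POSTS_PER_DRIVE); the post
    does not say which drive owns a boundary ID such as 46,000,000.
    """
    offset = post_id - first_start
    if offset < 0:
        raise ValueError(f"post {post_id} is below the first range")
    drive = offset // POSTS_PER_DRIVE + 1
    if drive > DRIVE_COUNT:
        raise ValueError(f"post {post_id} is beyond drive {DRIVE_COUNT}")
    return drive
```

With `first_start=23_000_000`, post 46,000,000 lands on drive 2 under the half-open assumption. The result should be checked against what is actually on each drive.
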

Compiled

This is a compilation of the 10 drives into one 500GB drive. It contains 11 tar files named in the format `archive[number].tar`. Each archive contains 23 million posts. The 11th file is a compilation of all log files from both the archive process and the import process into a database.

The snapshot ID is: snap-01cb07f8ec7df3c9b
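
A minimal sketch for taking stock of one of these tar files without unpacking it. It assumes the members keep the `postId-page.gz` naming used on the raw drives; the post does not describe the layout inside each tar, so that is a guess, and the function name is hypothetical.

```python
import re
import tarfile
from typing import Iterator, Tuple

# Member names ending in <postId>-<page>.gz, optionally inside directories.
_PAGE_NAME = re.compile(r"(?:^|/)(\d+)-(\d+)\.gz$")


def iter_archived_pages(tar_path: str) -> Iterator[Tuple[int, int, str]]:
    """Yield (post_id, page, member_name) for matching files in a tar."""
    with tarfile.open(tar_path, "r") as archive:
        for member in archive:  # walks members one at a time
            if not member.isfile():
                continue
            match = _PAGE_NAME.search(member.name)
            if match:
                yield int(match.group(1)), int(match.group(2)), member.name
```

Something like `sum(1 for _ in iter_archived_pages("archive1.tar"))` would then count the saved pages in one file, where `archive1.tar` is a placeholder path. Members that do not match the pattern, such as log files, are skipped.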

SQL

This is by far the easiest form to work with. If you do not need to work with the raw files, you should just use the SQL database.

A SQL database exists but I won’t be able to provide a dump right now. I will release information once I can produce one.

Features
  • View all threads and posts made on the forum, including all pages of threads (that existed on December 21st)
  • View threads made in subforums
  • Search for posts made by a user
  • Search for posts starting at a certain date for a user
  • Roblox links are clickable; Roblox forum links go to the archived post instead of Roblox

(Planned)

  • Search for posts by keyword (posts in subforums, by a specific user, all posts, etc.) This may or may not be viable.
  • Sort subforums in ascending or descending order
  • Search for posts after a certain date

(Would like to have)

  • Go to a specific page in a subforum (seems to be unviable with the current schema)

Known Issues
  • Posts with over 1000 pages will not display page counts correctly. All posts will still exist and every page can be accessed, but the page counter will only display the last three digits of the actual page total. This pretty much exclusively affects botted posts and isn’t a huge deal. I plan to fix this eventually.
  • Some posts with hundreds of pages were not fully saved. Most, if not all, of these are botted posts.
  • Some posts were missed. It is unclear how many, but so far I have only one confirmed post that should have been saved but was not.
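
To make the first issue concrete: if the counter keeps only the last three digits, the number shown is roughly the true page count modulo 1000. The snippet below only illustrates that arithmetic. It is not the site's code, the function name is made up, and reading the truncated digits as a number is an assumption.

```python
def displayed_page_count(actual_pages: int) -> int:
    """Number shown by a counter that keeps only the last three digits."""
    return actual_pages % 1000
```

For a thread with 1234 pages this gives 234, even though all 1234 pages remain reachable.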

Feel free to comment with questions or suggestions that are not already planned features.

51 Likes

Very impressive! Glad to see that such an ambitious project went through with flying colors! I can see this being used a lot, from research on Roblox’s history to looking up odd scripting questions that many people have pondered over in the past, but aren’t really available in the open anymore.

Also as a small note, I think that some sort of auto-filter caught your Reddit mirror. May want to talk to the mods there about manually approving your post.

4 Likes

Thanks! And thanks for letting me know about the reddit post, I’m guessing it has to be manually approved or something, I’ll get another temporary mirror up for now.

Edit: Reddit post approved now!

3 Likes

Wow, thanks for doing this!

1 Like

This is cool as hecc! I like being able to search by user. Interesting to see my oldest, cringy forum posts.

3 Likes

Amazing!

1 Like

Amazing!

2 Likes

Amazing!

1 Like

jesus christ i was so cringy on previous accounts.

It’s pretty depressing that i made this post when i was 15 and now i don’t have the slightest clue what any of it means. But knowing myself, this is probably filled with mistakes. Anyone want to check it out?

1 Like

Woah! Amazing work, I’ll be sure to use this.

I can easily see myself spending hours of my life browsing old threads on this.

Also, happy cake day!

1 Like

The search feature works better than the search feature on the real Roblox forums did!
Can I share the site with non-Devforum members? I know some people that could greatly benefit from this.

1 Like

My hero.
Props to you

Looking at the OP, yes. There’s a public reddit link.

Please do! The archive is for everyone

Thank you thank you thank you.

1 Like

My dude, you are incredible. Thank you!

If this is possible you would achieve true perfection, but nonetheless this is immensely helpful so thank you for making it.

Thank you so much